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Abstract 

Any NLO calculation of a QCD final-state observable involves Monte Carlo in- 
ly-^ ■ tegration over a large number of events. For DIS and hadron colliders this must 
O , usually be repeated for each new PDF set, making it impractical to consider 
O I many 'error' PDF sets, or carry out PDF fits. Here we discuss "a posteriori" 
^ ' inclusion of PDFs, whereby the Monte Carlo run calculates a grid (in x and Q) 
O . of cross section weights that can subsequently be combined with an arbitrary 
O ! PDF. The procedure is numerically equivalent to using an interpolated form 
l/-^ ' of the PDF. The main novelty relative to prior work is the use of higher-order 
CN . interpolation, which substantially improves the tradeoff between accuracy and 
^ i memory use. An accuracy of about 0.01% has been reached for the single in- 
^ I elusive cross-section in the central rapidity region \y\ < 0.5 for jet transverse 
'nI" . momenta from 100 to SOOOGeV. This method should facilitate the consistent 
^ , inclusion of final-state data from HERA, Tevatron and LHC data in PDF fits, 
O ! thus helping to increase the sensitivity of LHC to deviations from standard 
^ ■ Model predictions. 

O ; 

r-| ■ 1 Introduction 

The Large Hadron Collider (LHC), currently under construction at CERN, will collide protons on pro- 
tons with an energy of 7 TeV. Together with its high collision rate the high available centre-of-mass 
(— I ! energy will make it possible to test new interactions at very short distances that might be revealed in the 

' production cross-sections of Standard Model (SM) particles at very high transverse momentum (Pt) as 

• ■ deviation from the SM theory. 

j_j ■ The sensitivity to new physics crucially depends on experimental uncertainties in the measure- 

. 5^ , ments and on theoretical uncertainties in the SM predictions. It is therefore important to work out a 

strategy to minimize both the experimental and theoretical uncertainties from LHC data. For instance, 
one could use single inclusive jet or Drell-Yan cross-sections at low Pt to constrain the PDF uncertain- 
ties at high Pt- Typical residual renormalisation and factorisation scale uncertainties in next-to-leading 
order (NLO) calculations for single inclusive jet-cross-section are about 5 — 10% and should hopefully 
be reduced as NNLO calculations become available. The impact of PDF uncertainties on the other hand 
can be substantially larger in some regions, especially at large Pt, and for example at Pt = 2000 GeV 
dominate the overall uncertainty of 20%. If a suitable combination of data measured at the Tevatron and 
LHC can be included in global NLO QCD analyses, the PDF uncertainties can be constrained. 

The aim of this contribution is to propose a method for consistently including final-state observ- 
ables in global QCD analyses. 

For inclusive data like the proton structure function F2 in deep-inelastic scattering (DIS) the per- 
turbative coefficients are known analytically. During the fit the cross-section can therefore be quickly 
calculated from the strong coupling (as) and the PDFs and can be compared to the measurements. How- 
ever, final state observables, where detector acceptances or jet algorithms are involved in the definition of 
the perturbative coefficients (called "weights" in the following), have to be calculated using NLO Monte 
Carlo programs. Typically such programs need about one day of CPU time to calculate accurately the 
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cross-section. It is therefore necessary to find a way to calculate the perturbative coefficients with high 
precision in a long run and to include ag and the PDFs "a posteriori". 

To solve this problem many methods have been proposed in the past [1-4,4, 6]. In principle the 
highest efficiencies can be obtained by taking moments with respect to Bjorken-x [1,2], because this 
converts convolutions into multiplications. This can have notable advantages with respect to memory 
consumption, especially in cases with two incoming hadrons. On the other hand, there are complications 
such as the need for PDFs in moment space and the associated inverse Mellin transforms. 

Methods in x-space have traditionally been somewhat less efficient, both in terms of speed (in 
the 'a posteriori' steps — not a major issue here) and in terms of memory consumption. They are, 
however, somewhat more transparent since they provide direct information on the x values of relevance. 
Furthermore they can be used with any PDF. The use of x-space methods can be further improved by 
using methods developed originally for PDF evolution [7,8]. 



2 PDF-independent representation of cross-sections 
2.1 Representing tlie PDF on a grid 

We make the assumption that PDFs can be accurately represented by storing their values on a two- 
dimensional grid of points and using n*'^-order interpolations between those points. Instead of using 
the parton momentum fraction x and the factorisation scale Q^, we use a variable transformation that 
provides good coverage of the full x and range with uniformly spaced grid points:^ 

1 

j/(x) = ln— and rfQ^) = In In —5-. (1) 



X 



The parameter A is to be chosen of the order of Aqcd> but not necessarily identical. The PDF q{x,Q'^) 
is then represented by its values Qiy^i^ at the 2-dimensional grid point {iy 6y, ir St), where 6y and 6t 
denote the grid spacings, and obtained elsewhere by interpolation: 

where n, n' are the interpolation orders. The interpolation function J^- {u) is 1 for u = i and otherwise 
is given by: 

i\[n — i)\ u — I 

Defining int(u) to be the largest integer such that int(u) < u, k and k are defined as: 

k{x)= int(f)-I^), K{x)=int(^^-^y (4) 

Given finite grids whose vertex indices range from . . . A'^y — 1 for the y grid and . . . A'^^ — 1 for the r 
grid, one should additionally require that eq. ^ only uses available grid points. This can be achieved by 
remapping k max(0, mm{Ny — 1 — n, k)) and k —>■ max(0, min(A^T- — 1 — n', k)). 

2.2 Representing the final state cross-section weights on a grid (DIS case) 

Suppose that we have an NLO Monte Carlo program that produces events m = 1 . . . A^. Each event m 
has an x value, x^n, a value, Q"^, as well as a weight, Wm, and a corresponding order in q^, pm- 



'An alternative for the x grid is to use y = In 1/x- + ci(l — a;) with a a parameter that serves to increase the density of points 
in the large x region. 
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Normally one would obtain the final result W of the Monte Carlo integration from: 

w^=E^-(^^) (5) 



m=l 



Instead one introduces a weight grid VFj^ and then for each event updates a portion of the grid 

with: 

z = . . . n, t = . . . n' : 

- <«,L. + "-^ 4"' - *) A'"'' - «) . (« 

where A; = k{x^^ k = k{Q^). 
The final result for W, for an arbitrary PDF, can then be obtained subsequent to the Monte Carlo run: 

la (O'^^'-'W 
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where the sums index with iy and ir run over the number of grid points and we have have explicitly 
introduced x'^^y^ and Q^^*"^^ such that: 

y{x^'v)) = iy Sy and r (q^^*-) j = St. (8) 
2.3 Including renormalisation and factorisation scale dependence 

(v) 

If one has the weight matrix '^^ determined separately order by order in ag, it is straightforward 
to vary the renormalisation /x^ and factorisation fip scales a posteriori (we assume that they were kept 
equal in the original calculation). 

It is helpful to introduce some notation relating to the DGLAP evolution equation: 

^ . -|!> ,P„ , , (f^) ' (P. , , . . . . 

where the Pq and Pi are the LO and NLO matrices of DGLAP splitting functions that operate on vectors 
(in flavour space) q of PDFs. Let us now restrict our attention to the NLO case where we have just two 
values of p, pLO and pnlo- Introducing and corresponding to the factors by which one varies /Xi? 
and fip respectively, for arbitrary and we may then write: 



a. 



27r 



PNLO 

+ InCl (10) 



where /3o = (llATc— 2n/)/(127r) and Arc(n/) is the number of colours (flavours). Though this formula is 
given for x-space based approach, a similar formula applies for moment-space approaches. Furthermore 
it is straightforward to extend it to higher perturbative orders. 

^Here, and in the following, renormalisation and factorisation scales have been set equal for simplicity. 



2,4 Representing the weights in the case of two incoming hadrons 

In hadron-hadron scattering one can use analogous procedures with one more dimension. Besides Q^, 
the weight grid depends on the momentum fraction of the first (xi) and second {xi) hadron. 

In the case of jet production in proton-proton colUsions the weights generated by the Monte Carlo 
program as well as the PDFs can be organised in seven possible initial state combinations of partons: 



gg : 




X2 


Q') 


= Gi(xi)G2(x2) 




(11) 


qg : 




X2 


Q') 


= (Qi(xi)+Qi(xi))G2(x2) 




(12) 


gq : 




X2 




= G'i(xi) (Q2(x2) + Q2(a;2)) 




(13) 


qr : 




X2 


Q') 


= Ql(xi)(32(x2) + (5i(a;i)Q2(^2) 


- D{xi,X2) 


(14) 


qq : 




X2 


Q') 


= L'(xi,X2) 




(15) 


qq : 




X2 


Q') 


= i:>(xi,X2) 




(16) 


qr : 




X2 




= Qlixi)Q2ix2) + Qi{xi)Q2{x2) 


- D{xi,X2), 


(17) 



where g denotes gluons, q quarks and r quarks of different flavour g / r and we have used the generalized 
PDFs defined as: 



i=l j=— 6 

6 

Z)(X1,X2) = E fi/H,{xi,Q^)h,H,{x2,Q^), (18) 



i=-6 
6 

D{xl,X2,^l\) = E fi/H^{xi,Q^)f_ilH^{x2,Q^^ 
i=-6 



where /^/^ is the PDF of flavour i = —6 ... 6 for hadron H and Hi {H2) denotes the first or second 
hadron^ . 

The analogue of eq.Qis then given by: 



^=EEEEE<^iv ' ] F«(x^^x^^Q^(-)). (i9) 



P /=0 i 
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2.5 Including scale depedence in the case of two incoming hadrons 

It is again possible to choose arbitrary renormalisation and factorisation scales, specifically for NLO 
accuracy: 



2 /o2(«^) 



W{in,iF) = EEEE ( ) (xf-\x?-\e^Q^(-)) + 



Plo 



=0 iy, i 



Vl '!/2 



2, , l(<%^!- +2./3oPLoln^|<^°)^5t)F«(.^),4-),e2Q^ (20) 

where Fq^-,Pf,(^q^ is calculated as F*^'), but with qi replaced wtih P^^qi, and analogously for F^'^^p^^^^^. 

^In the above equation we follow the standard PDG Monte Carlo numbering scheme [9] where gluons are denoted as 0, 
quarks have values from 1-6 and anti-quarks have the corresponding negative values. 



3 Technical implementation 

To test the scheme discussed above we use the NLO Monte Carlo program NLOJET++ [10] and the 
CTEQ6 PDFs [11]. The grid w}^^f^ .-^ of eq.[T9lis filled in a NLOJET++ user module. This module 
has access to the event weight and parton momenta and it is here that one specifies and calculates the 
physical observables that are being studied (e.g. jet algorithm). 

Having filled the grid we construct the cross-section in a small standalone program which reads 
the weights from the grid and multiplies them with an arbitrary as and PDF according to eq. ^] This 
program runs very fast (in the order of seconds) and can be called in a PDF fit. 

The connection between these two programs is accomplished via a C++ class, which provides 
methods e.g. for creating and optimising the grid, filling weight events and saving it to disk. The classes 
are general enough to be extendable for the use with other NLO calculations. 

The complete code for the NLOJET++ module, the C++ class and the standalone job is available 
from the authors. It is still in a development, testing and tuning stage, but help and more ideas are 
welcome. 

3.1 The C++ class 

The main data members of this class are the grids implemented as arrays of three-dimensional ROOT 
histograms, with each grid point at the bin centers^: 

TH3D[p][l][iobs](xi,X2,Q'), (21) 

where the I and p are explained in eq.^Jand iohs denotes the observable bin, e.g. a given Pt range^. 
The C++ class initialises, stores and fills the grid using the following main methods: 

• Default constructor: Given the pre-defined kinematic regions of interest, it initializes the grid. 

• Optimizing method: Since in some bins the weights will be zero over a large kinematic region in 
xi, X2, Q^, the optimising method implements an automated procedure to adapt the grid bound- 
aries for each observable bin. These boundaries are calculated in a first (short) run. In the present 
implementation, the optimised grid has a fixed number of grid points. Other choices, like a fixed 
grid spacing, might be implemented in the future. 

• Loading method: Reads the saved weight grid from a ROOT file 

• Saving method: Saves the complete grid to a ROOT file, which will be automatically compressed. 

3.2 The user module for NLOJET++ 

The user module has to be adapted specifically to the exact definition of the cross-section calculation. If a 
grid file already exists in the directory where NLOJET++ is started, the grid is not started with the default 
constructor, but with the optimizing method (see l3.lt . In this way the grid boundaries are optimised for 
each observable bin. This is necessary to get very fine grid spacings without exceeding the computer 
memory. The grid is filled at the same place where the standard NLOJET++ histograms are filled. After 
a certain number of events, the grid is saved in a root-file and the calculation is continued. 

''root histograms are easy to implement, to represent and to manipulate. They are therefore ideal in an early development 
phase. An additional advantage is the automatic file compression to save space. The overhead of storing some empty bins 
is largely reduced by optimizing the xi, X2 and grid boundaries using the NLOJET++ program before final filling. To 
avoid this residual overhead and to exploit certain symmetries in the grid, a special data class (e.g. a sparse matrix) might be 
constructed in the future. 

^For the moment we construct a grid for each initial state parton configuration. It will be easy to merge the qg and the gq 
initial state parton configurations in one grid. In addition, the weights for some of the initial state parton configurations are 
symmetric in x\ and x^. This could be exploited in future applications to further reduce the grid size. 



3.3 The standalone program for constructing the cross-section 

The standalone program calculates the cross-section in the following way: 

1. Load the weight grid from the ROOT file 

2. Initialize the PDF interface^, load q{x,Q'^) on a helper PDF-grid (to increase the performance) 

3. For each observable bin, loop over iy^ , iy2,iT, l,P and calculate F\xi, X2,Q'^) from the appropri- 
ate PDFs q{x, Q^), multiply as and the weights from the grid and sum over the initial state parton 
configuration I, according to ea.[T9l 

4 Results 

We calculate the single inclusive jet cross-section as a function of the jet transverse momentum (Ft) 
for jets within a rapidity of \y\ < 0.5. To define the jets we use the seedless cone jet algorithm as im- 
plemented in NLOJET-i~i- using the four-vector recombination scheme and the midpoint algorithm. The 
cone radius has been put to i? = 0.7, the overlap fraction was set to / = 0.5. We set the renormalisation 
and factorization scale to = P^max' where Px^max is the Pt of the highest Pt jet in the required 
rapidity region^. 

In our test runs, to be independent from statistical fluctuations (which can be large in particular 
in the NLO case), we fill in addition to the grid a reference histogram in the standard way according to 
eq.E 

The choice of the grid architecture depends on the required accuracy, on the exact cross-section 
definition and on the available computer resources. Here, we will just sketch the influence of the grid 
architecture and the interpolation method on the final result. We will investigate an example where 
we calculate the inclusive jet cross-section in A'obs = 100 bins in the kinematic range 100 < Pt < 
5000 GeV. In future applications this can serve as guideline for a user to adapt the grid method to 
his/her specific problem. We believe that the code is transparent and flexible enough to adapt to many 
applications. 

As reference for comparisons of different grid architectures and interpolation methods we use the 
following: 

• Grid spacing in y{x): 10^^ < xi,X2 < 1.0 with Ny = 30 

• Grid spacing in t{Q'^): 100 GeV <Q < 5000 GeV with = 30 

• Order of interpolation: Uy = 3, rir = 3 

The grid boundaries correspond to the user setting for the first run which determines the grid boundaries 
for each observable bin. In the following we call this grid architecture 30^x30x100(3, 3). Such a grid 
takes about 300 Mbyte of computer memory. The root-file where the grid is stored has about 50 Mbyte. 

The result is shown in Fig. [^). The reference cross-section is reproduced everywhere to within 
0.05%. The typical precision is about 0.01%. At low and high Pt there is a positive bias of about 
0.04%. Also shown in Fig. [T]i) are the results obtained with different grid architectures. For a finer 
X grid (50^x30x100(3, 3)) the accuracy is further improved (within 0.005%) and there is no bias. A 
finer (30^x60x100(3, 3)) as wefl as a coarser (30^x10x100(3, 3)) binning in does not improve the 
precision. 

Fig. ^) and Fig. [Q;) show for the grid (30^x30x100) different interpolation methods. With an 
interpolation of order n = 5 the precision is 0.01% and the bias at low and high Pt observed for the 
n = 3 interpolation disappears. The result is similar to the one obtained with finer x-points. Thus by 

''We use the C++ wrapper of the LHAPDF interface [12]. 

'Note that beyond LO the PT,max will in general differ from the Pt of the other jets, so when binning an inclusive jet 
cross section, the Pt of a given jet may not correspond to the renormalisation scale chosen for the event as a whole. For this 
reason we shall need separate grid dimensions for the jet Pt and for the renormalisation scale. Only in certain moment-space 
approaches [2] has this requirement so far been efficiently circumvented. 



increasing the interpolation order the grid can be kept smaller. An order n = 1 interpolation gives a 
systematic negative bias of about 1% becoming even larger towards high Pt- 

Depending on the available computer resources and the specific problem, the user will have to 
choose a proper grid architecture. In this context, it is interesting that a very small grid 10^x10x100(5, 5) 
that takes only about 10 Mbyte computer memory reaches still a precision of 0.5%, if an interpolation of 
order n = 5 is used (see Fig.nji)). 
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Fig. 1: Ratio between the single inclusive jet cross-section with 100 Pt bins calculated with the grid technique and the 
reference cross-section calculated in the standard way. Shown are the standard grid, grids with finer x and sampling (a) 
with interpolation of order 1, 3 and 5 (b) (and on a finer scale in c)) and a small grid (d). 



5 Conclusions 

We have developed a technique to store the perturbative coefficients calculated by an NLO Monte Carlo 
program on a grid allowing for a-posteriori inclusion of an arbitrary parton density function (PDF) set. 
We extended a technique that was already successfully used to analyse HERA data to the more demand- 
ing case of proton-proton collisions at LHC energies. 

The technique can be used to constrain PDF uncertainties, e.g. at high momentum transfers, from 
data that will be measured at LHC and allows the consistent inclusion of final state observables in global 
QCD analyses. This will help increase the sensitivity of LHC to find new physics as deviations from the 
Standard Model predictions. 

Even for the large kinematic range for the parton momentum fractions xi and X2 and of the squared 
momentum transfer accessible at LHC, grids of moderate size seem to be sufficient. The single 
inclusive jet cross-section in the central region \y\ < 0.5 can be calculated with a precision of 0.01% 
in a realistic example with 100 bins in the transverse jet energy range 100 < Pt < 5000 GeV. In this 
example, the grid occupies about 300 Mbyte computer memory. With smaller grids of order 10 Mbyte 
the reachable accuracy is still 0.5%. This is probably sufficient for all practical applications. 
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